Semi-Automatic Entity Set Refinement
Authors
Abstract
State-of-the-art set expansion algorithms produce expansions of varying quality for different entity types. Even for the highest-quality expansions, errors still occur, and manual refinement is necessary for most practical uses. In this paper, we propose algorithms to aid this refinement process, greatly reducing the amount of manual labor required. The methods rely on the fact that most expansion errors are systematic, often stemming from ambiguous seed elements. Empirical evidence shows that, with our methods, average R-precision over random entity sets improves by 26% to 51% when given 5 to 10 manually tagged errors. Both proposed refinement models have time complexity linear in the set size, allowing for practical online use in set expansion systems.
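For readers unfamiliar with the metric, R-precision is the precision of a ranked list over its top R items, where R is the number of correct (gold) entities. The sketch below is only an illustration of the metric. The entity names, the `r_precision` helper, and the step of dropping a tagged error are hypothetical and are not the refinement models proposed in the paper.

```python
def r_precision(ranked, gold):
    """Precision over the top R ranked items, where R = len(gold)."""
    r = len(gold)
    if r == 0:
        raise ValueError("gold set must be non-empty")
    top_r = ranked[:r]
    return sum(1 for item in top_r if item in gold) / r


# Hypothetical expansion of a "US states" set polluted by an ambiguous seed.
gold = {"Ohio", "Texas", "Utah", "Iowa"}
ranked = ["Ohio", "Georgia (country)", "Texas", "Utah", "Iowa"]
before = r_precision(ranked, gold)

# Assumption: the simplest possible use of feedback is to drop the one
# manually tagged error. The paper's models go further than this.
tagged_errors = {"Georgia (country)"}
refined = [entity for entity in ranked if entity not in tagged_errors]
after = r_precision(refined, gold)

print(before, after)
```

In this toy case a single tagged error moves a correct entity into the top R, which is why even a few tags can shift the score noticeably.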
Similar resources
Fine-grained Entity Set Refinement with User Feedback
State-of-the-art semi-supervised entity set expansion algorithms produce noisy results, which need to be refined manually. Sets expanded for intended fine-grained concepts are especially noisy because these concepts are not well represented by the limited number of seeds. Such sets are usually incorrectly expanded to contain elements of a more general concept. We show that fine-grained control ...
Concept Detector Refinement on Social Videos
The explosion of social video sharing sites poses new challenges for video search and indexing techniques. Because of the concept diversity in social videos, it is very hard to build a well-annotated dataset that provides good coverage of the full meaning of concepts. However, the prosperity of social video also makes it easy to obtain a huge number of videos, which gives an opportunity to ...
Heuristics on the Definition of UML Refinement Patterns
In this article we present a strategy to formalize frequently occurring forms of refinement that take place in UML model construction. This strategy consists of recognizing a set of well-founded refinement structures in a formal language, which are then embedded in a UML-based development, giving rise to a set of UML refinement patterns. Apart from providing semi-formal evidence on the prese...
EASEAndroid: Automatic Policy Analysis and Refinement for Security Enhanced Android via Large-Scale Semi-Supervised Learning
Mandatory protection systems such as SELinux and SEAndroid harden operating system integrity. Unfortunately, policy development is error prone and requires lengthy refinement using audit logs from deployed systems. While prior work has studied SELinux policy in detail, SEAndroid is relatively new and has received little attention. SEAndroid policy engineering differs significantly from SELinux:...
Publication date: 2009